ABSTRACT

We estimate the stellar parameters of late K and early M-type Kepler target stars. We obtained
medium-resolution visible spectra of 388 stars with Kp - J > 2 (~K5 and later spectral type).
We determine luminosity class by comparing the strength of gravity-sensitive indices (CaH, K I, Ca
II, and Na I) to their strength in a sample of stars of known luminosity class. We find that giants
constitute 95±1% of the bright (Kp < 14) red Kepler target stars, and 7±2% of dim (Kp > 14) stars,
significantly higher than fractions based on the stellar parameters quoted in the Kepler Input Catalog (KIC).
The KIC effective temperatures are systematically (135^40 K) higher than temperatures determined
from fitting our spectra to PHOENIX stellar models. Through Monte Carlo simulations of the Kepler
exoplanet candidate population, we find that there are 0.38 ± 0.08 planets per star when giant stars
are properly removed, somewhat higher than when a KIC log g > 4 criterion is used (0.27 ± 0.05).
Lastly, we show that there is no significant difference in g - r color (a probe of metallicity) between
late-type Kepler stars with transiting Earth-to-Neptune sized exoplanet candidates and dwarf stars
with no detected transits, in line with what is seen for solar-type stars. We show that a previously
claimed offset between these two populations is most likely an artifact of including a large number of
misidentified giants.

Subject headings: Stars, Extrasolar Planets, M Stars, Kepler, Giant stars, Dwarf stars 



1. INTRODUCTION 

The NASA Kepler mission (Borucki et al. 2010) has ushered exoplanet science
into a new phase of analysis based on the statistics of large samples. Among
the more elementary statistics are the frequency of planets around stars, the
distribution of planet size (or mass), and any correlation between the
presence of planets and the properties of the host stars. These are important,
if sometimes ambiguous, constraints on models of planet formation and
evolution. These properties are best established for solar-type stars (late F
through early K spectral types) because many nearby representatives are bright
enough for ground-based Doppler radial velocity observations, and because they
constitute the vast majority of Kepler targets. More than 15% of dwarf stars
have close-in (≲0.25 AU) planets with orbital periods less than 50 days
(Howard et al. 2010, 2011), and this fraction increases with orbital period
(Mayor et al. 2011). The same authors find that planet frequency is inversely
related to planet mass or radius, with "super-Earths" outnumbering
Jupiter-size planets by more than an order of magnitude. Around solar-type
stars, the presence of giant planets is strongly correlated with super-solar
metallicity (Gonzalez 1997; Santos et al. 2001; Fischer & Valenti 2005;
Johnson et al. 2010), but this correlation does not appear to hold for smaller
planets (Sousa et al. 2008; Bouchy et al. 2009; Mayor et al. 2011).

Very cool (late K and early M type) dwarf stars have become popular targets of
planet searches (Charbonneau et al. 2009; Vogt et al. 2010; Bean et al. 2010;
Apps et al. 2010; Fischer et al. 2012). Planets around cool stars are easier
to detect because of the stars' smaller masses and radii. Furthermore, because
these stars are less luminous, close-in and thus detectable planets can still
orbit within the "habitable zone," where an Earth-like planet would avoid the
"snowball" or runaway greenhouse climate states (Gaidos et al. 2007). However,
the statistics of planets around these stars are more poorly established.
These stars are underrepresented in magnitude-limited Doppler surveys as well
as the Kepler target list. Only 2% of Kepler target stars are classified as
possible M types (cooler than 4000 K), whereas >70% of all stars within 20 pc
are M dwarfs (Chabrier 2003). Nevertheless, Howard et al. (2011) find that the
frequency of candidate planets around Kepler stars rises with decreasing
effective temperature through at least early K-type (~5000 K). Schlaufman &
Laughlin (2011) used colors from Kepler Input Catalog (KIC) photometry as
proxies for metallicity and claimed that late-type (i.e., J - H ≈ 0.62) stars
hosting candidate transiting planets have significantly redder g - r colors,
and hence are more metal-rich, than stars for which transits have not been
detected. Because very few of these candidates should host giant planets
(Johnson et al. 2010), this result contradicts previous findings (see above).

Reliable stellar parameters are a prerequisite for robust statistical analysis
of planets, especially transiting planets. These are needed not only for stars
for which planet candidates have been detected (referred to as Kepler Objects
of Interest or KOIs), but also for the target sample as a whole. For example,
the radius of a planet producing a given transit depth is proportional to the
radius of its host star. Likewise, the transit signal produced by a planet of
a given radius, and hence its detectability around a star in the survey, also
depends on stellar radius. If some target stars are actually larger than
assumed (or are even giant stars), then planets are less likely to be detected
in that sample, which means that the most likely occurrence rate of those
planets is higher. For M dwarf stars in general, and particularly for the
coolest Kepler target stars, parameters such as radius are uncertain or even
very unreliable (e.g. Johnson et al. 2011; Muirhead et al.).

Kepler targets are selected from the Kepler Input Catalog (KIC) based on the
ability of the mission to find transiting planets, especially in the habitable
zone; ideally, the target catalog should consist exclusively of dwarf stars,
for which the signal of a transiting planet is largest, and exclude sub-giant
and giant stars. Brown et al. (2011) used D51 (Mg Ib line) photometry and
Sloan g - D51 colors to exclude giants; however, this color is also sensitive
to temperature and metallicity and is not available for all targets. The KIC
includes Sloan (griz) and 2MASS (JHK) magnitudes; stellar parameters are
estimated by forward modeling of the photometric data with the synthetic
spectra of Castelli & Kurucz (2004), with effective temperature T_eff, gravity
log g, and metallicity [M/H] as free parameters. Stellar mass and distance are
then estimated using luminosity, T_eff, and log g from the stellar
evolutionary models of Girardi et al. (2000). The combination of stellar mass
and log g then yields a stellar radius.
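
The last step follows from the definition of surface gravity, g = G M / R^2. The sketch below is only an illustration of that relation, not the KIC pipeline; the function name and the adopted physical constants are assumptions made for this example.

```python
import math

# Physical constants in cgs units (values adopted for this illustration only).
G_CGS = 6.674e-8      # gravitational constant [cm^3 g^-1 s^-2]
M_SUN_G = 1.989e33    # solar mass [g]
R_SUN_CM = 6.957e10   # solar radius [cm]

def radius_from_mass_logg(mass_msun, logg_cgs):
    """Stellar radius [R_sun] from mass [M_sun] and log g [cgs], via g = G M / R^2."""
    g = 10.0 ** logg_cgs
    return math.sqrt(G_CGS * mass_msun * M_SUN_G / g) / R_SUN_CM

# A solar-mass star with solar gravity (log g ~ 4.44) versus a low-gravity giant.
r_dwarf = radius_from_mass_logg(1.0, 4.438)
r_giant = radius_from_mass_logg(1.0, 2.5)
```

Because R scales as 10^(-log g / 2), an error of 1 dex in log g changes the inferred radius by a factor of about 3, which is why a poorly constrained gravity translates directly into an unreliable radius.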

Brown et al. (2011) state that KIC radius estimates have average errors of 35%
and are not reliable for stars cooler than 4000 K. Howard et al. (2011) point
out that, because of the difficulty in constraining log g, the radii of some
stars, particularly sub-giants, may be underestimated by a factor of 2 or more
in the KIC. Gaidos et al. (2012) found that consistency between the Kepler
candidate planet catalog and the M2K Doppler survey could be achieved if the
former was incomplete compared to estimates based on KIC radii. They further
point out that Kepler planet candidates were conspicuously sparse among late K
stars with colors that are shared by both dwarfs and giant stars. Finally,
Muirhead et al. (2011) showed that the radii of many Kepler M dwarfs hosting
planets are smaller than KIC values by as much as a factor of two. This is not
to be confused with the 5-10% radius discrepancy between the most refined
models and measurements by interferometry and observations of eclipsing
binaries (e.g. Lopez-Santiago et al. 2010; Kraus et al. 2011).

Brown et al. (2011) regard the uncertainty of their metallicity estimates to
be at least ±0.4 dex. Instead, Schlaufman & Laughlin (2011) use Sloan g - r
colors for a given J - H range as an indicator of the amount of Fe line
blanketing at blue wavelengths, and hence metallicity. They construct mean
g - r vs. J - H loci for KOIs and Kepler stars without identified transits.
They find a significant difference between the two populations for stars with
J - H ≈ 0.62, corresponding approximately to late K-type. This is inconsistent
with Muirhead et al. (2011), who estimate metallicity using the equivalent
widths of atomic lines in the K (2.2 μm) band and find that KOIs might be
slightly metal-poor compared to the average target. However, K giants are
significantly bluer than dwarfs in g - r for the same J - H (Yanny et al.
2009). Thus, contamination of the Kepler target sample by giants would shift
the locus of the target stars to bluer g - r, but not the KOI locus, as
planets are less detectable, or completely undetectable, around giant stars.
Realizing this, Schlaufman & Laughlin (2011) constructed and analyzed
artificial mixed data sets to estimate that a 10-30% contamination by giants
would also produce the observed offset.

Moderate-resolution spectra are nearly always sufficient to distinguish K and
M giants from their dwarf cousins. In addition, Ciardi et al. (2011) showed
that some giant stars could be identified based on JHK photometry alone. In
Section 2 we present spectroscopy of a representative sample of late-type
Kepler target stars. In Section 3 we use both spectroscopy and photometry to
derive luminosity classes and effective temperatures. In Section 4 we use this
information, plus radii based on stellar evolutionary models, to refine the
frequency of planets around these stars. In Section 5 we calculate and compare
the mean g - r colors (as metallicity proxies) of KOIs and a bona fide dwarf
sample.

2. SAMPLE, OBSERVATIONS, AND REDUCTION 

Because derived KIC parameters may not always be reliable, we instead select
our sample using available photometry of the Kepler field. A sample of stars
with V - J > 2.5 will include > 98% of all M dwarfs, as well as most of the K7
dwarfs in the sample (Lepine & Gaidos 2011). Although 2MASS J magnitudes are
available for almost the entire sample, V magnitudes are not. Kp, however, is
available for all target stars. For M0 stars, Kp - V ≈ -0.43 (see
http://keplergo.arc.nasa.gov/CalibrationZeropoint.shtml),
so we conservatively select stars with Kp - J > 2 observed in Quarters 0-6 by
Kepler and retrieved from the Multimission Archive (STScI). We remove stars
with a contaminating star within 1 arcsecond.

Bright Kepler target stars are selected in a fundamentally different way from
dim stars (see Figure 1; Batalha et al. 2010). We separately analyze dim
(Kp > 14) and bright (Kp < 14) stars. Bessell & Brett (1988) showed that giant
stars tend to have more extreme J - H colors than their dwarf counterparts.
However, we want to investigate how misidentified giant stars in the KIC
Fig. 1. Kepler magnitude vs. J - H color for Quarter 0-6 Kepler target stars
with Kp - J > 2. Our observing bins (see Section 2) are marked by blue lines,
and red points denote stars that we observed. There is a clear difference
between how bright (Kp < 14) and dim (Kp > 14) stars are selected, resulting
in a very different distribution of colors. For this reason we treat bright
and dim Kepler target stars as two independent samples.

Fig. 2. Distribution of KIC effective temperatures and Kp - J colors for
target stars. The bulk of the stars in our spectroscopic sample are M dwarfs
(T_eff < 4000 K) if we assume KIC T_eff values are accurate. Note that not all
stars have effective temperatures listed in the KIC; points lacking T_eff
values are not shown in the center plot or bottom histogram, but are included
in the Kp - J histogram.

are distributed with J - H color. Thus we further subdivide our sample into
four J - H color bins: J - H < 0.70, 0.70 < J - H < 0.76, 0.76 < J - H < 0.82,
and 0.82 < J - H for the bright stars and J - H < 0.62, 0.62 < J - H < 0.65,
0.65 < J - H < 0.68, and J - H > 0.68 for the dim stars. We observe a randomly
selected sample of stars within each bin, although we have more observations
in the bright bins because they are more observationally accessible. In total
we observed 388 stars covering 6.5 < Kp < 16, 0.40 < J - H < 1.00, and KIC
effective temperatures 3200 < T_eff < 5050 K. We show the distribution of
observed targets in J - H and KIC T_eff space in Figure 2.

Observations were obtained between June 16 and August 28, 2011 with the
SuperNova Integral Field Spectrograph (SNIFS; Lantz et al. 2004) at the
University of Hawaii 2.2 m telescope on Mauna Kea and the Boller and Chivens
CCD Spectrograph (CCDS) or the Mark III spectrograph (MkIII) at the 1.3 m
McGraw-Hill telescope on Kitt Peak. SNIFS is an optical integral field
spectrograph with R ~ 1300 that splits the signal with a dichroic mirror into
blue (3000 - 5200 Å) and red (5000 - 9500 Å) channels. The images are
resampled with microlens arrays, dispersed with grisms, and focused onto blue-
and red-sensitive CCDs. Processing of SNIFS data was performed with the SNIFS
pipeline, described in detail by Aldering et al. (2006) and Pereira et al.
(2010). SNIFS processing includes dark, bias, and flat-field corrections,
assembling the data into red and blue 3D data cubes, and cleaning them of
cosmic rays and bad pixels. After sky subtraction, the spectra are extracted
with a semi-analytic PSF model and wavelength-calibrated with arc lamp
exposures taken at the same telescope pointing as the science data. The CCDS
and MkIII spectrographs cover 5700 - 9300 Å and 4400 - 8300 Å with R ~ 1150
and ~ 2300, respectively. Standard reduction of data taken with the CCDS and
MkIII was performed with IRAF, following the practice of overscan subtraction,
division by flat field, and extraction of the spectra. Spectra were
wavelength-calibrated against NeAr comparison arcs. All observations
(including SNIFS) were flux-calibrated, and telluric lines were removed, based
on observations of the NOAO primary spectrophotometric standards Feige 66,
Feige 110, and BD+28 4211. Every spectrum has a median S/N > 30 in the
6000-7000 Å range, and the median S/N over all spectra in this range is 50.

Our spectroscopic set only covers Kp - J > 2.0, but we also consider a
separate 'photometric sample' that includes stars with 0.56 < J - H < 0.66 or
Kp - J > 2. This is done to ensure coverage of the sample of late K stars used
by Schlaufman & Laughlin (2011) (see Section 5). The KIC includes infrared
(J, H, and K) photometry from 2MASS (Skrutskie et al. 2006) and
visible-wavelength photometry through SDSS griz and D51 filters. We add
photometry from the Wide-field Infrared Survey Explorer (WISE; Wright et al.
2010) at 3.4 μm, 4.6 μm, 12 μm, and 22 μm. WISE-based color relations can be
applied only to part of the Kepler sample, as the current WISE data release
covers just 50% of the Kepler field.

3. LUMINOSITY CLASS 

We determine luminosity class by comparing the spectral indices or colors of
Kepler target stars to those of stars drawn from 'training sets' of known
giants or dwarfs. We first discuss how we construct our training sets. We then
explain our choice of indices and color-color relations, which are based on
previous work on giant/dwarf discrimination and on an empirical examination of
the differences between the dwarf and giant training sets. We use the colors
and spectroscopic indices of stars in the training sets to construct a
likelihood estimator, with which we calculate the likelihood that a given star
is a giant (or dwarf). That calculation is explained in Section 3.4.

3.1. Training Sets 

We construct an uncontaminated set of dwarf stars from a sample of high proper
motion-selected late K and M stars (Lepine & Gaidos 2011). The brightest
(J < 9) northern stars in this sample have visible-wavelength spectra (Lepine
et al. in prep). Spectra from this sample were obtained with the same
instruments and reduced in the same way as those of the Kepler targets
observed for this paper, making it an ideal comparison set. Although this
sample includes more than 1500 spectra, we choose to use the 620 targets with
spectra from SNIFS/UH2.2m, whose wavelength coverage includes the Ca II
triplet (8484 - 8662 Å).

Lepine & Gaidos (2011) use J - H and H - K colors, combined with proper
motions from SUPERBLINK (Lepine & Shara 2005) and (for some targets) parallax
information from Hipparcos (van Leeuwen & Fantino 2005; van Leeuwen 2007), to
remove giant stars. Based on those stars in Lepine & Gaidos (2011) with
parallaxes, we estimate that fewer than 0.5% of the resulting sample will be
giants. However, because of strict cuts in J - H and H - K, the Lepine &
Gaidos (2011) sample is incomplete and biased against dwarfs with much redder
or bluer colors. Lepine & Gaidos (2011) also use a color cut of V - J > 2.7 to
select mostly M dwarfs. This excludes some mid- to late-K stars which will be
included in our Kp - J > 2 cut and/or the 0.58 < J - H < 0.66 color cut for
the photometric sample (see Section 2). We therefore add 60 late K and early M
dwarfs that are included in the Hipparcos catalog and have UH2.2m spectra but
lie outside the cuts imposed by Lepine & Gaidos (2011). These stars are
confirmed to be dwarfs by their Hipparcos parallaxes. We also add 150 M dwarfs
with spectra from SDSS, including 50 dwarfs from West et al. (2011), with
r - J and J - H colors consistent with our targets of interest. We verify that
these targets are dwarfs using a cut on reduced proper motion, where the
reduced proper motion in the SDSS g band is:



TABLE 1 

Definitions of Spectroscopic Indices 



$$ H_g = g + 5 \log \mu + 5, \tag{1} $$



and μ is the proper motion in arcsec yr^-1. This quantity is similar to the
absolute magnitude, such that giant stars will have much lower reduced proper
motions than dwarfs of the same color. We only select SDSS stars with
H_g > 2.2(g - r) + 7.0 and μ > 15 arcsec yr^-1, a cut which we determine
empirically from our UH2.2m targets with SDSS photometry.
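
The reduced proper motion cut can be sketched as follows. This is a minimal illustration, assuming H_g = g + 5 log10(μ) + 5 with μ in arcsec per year; the function names are hypothetical, and only the color-dependent H_g cut is applied here (the separate proper-motion threshold is left to the caller).

```python
import math

def reduced_proper_motion(g_mag, mu_arcsec_per_yr):
    """Reduced proper motion H_g = g + 5 log10(mu) + 5, with mu in arcsec/yr."""
    return g_mag + 5.0 * math.log10(mu_arcsec_per_yr) + 5.0

def passes_dwarf_cut(g_mag, r_mag, mu_arcsec_per_yr):
    """True if the star lies on the dwarf side of H_g > 2.2 (g - r) + 7.0."""
    h_g = reduced_proper_motion(g_mag, mu_arcsec_per_yr)
    return h_g > 2.2 * (g_mag - r_mag) + 7.0

# Two hypothetical stars with the same photometry but different proper motions.
fast_mover = passes_dwarf_cut(15.0, 13.6, 0.1)     # H_g = 15.0
slow_mover = passes_dwarf_cut(15.0, 13.6, 0.001)   # H_g = 5.0
```

At fixed apparent magnitude and color, a distant giant has a much smaller proper motion than a nearby dwarf, so it falls below the line and is rejected.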

Our sample of > 300 giant spectra is constructed from multiple catalogs,
specifically Fluks et al. (1994), Danks & Dennefeld (1994), Allen & Strom
(1995), Serote Roos et al. (1996), Montes et al. (1999), and Lancon & Wood
(2000), as well as 80 bright stars we observed with UH2.2m/SNIFS that are
confirmed to be giants by Hipparcos. Many spectra have significantly higher
resolution than our own observations. We convolve these data with a Gaussian
to match the resolution of our own sample, removing any resolution dependence
in our results. To include sufficient SDSS photometry, we supplement our giant
training set with 200 giant stars with spectra from SDSS, all with r < 16 and
proper motions consistent with zero. We require these SDSS spectra to have
spectroscopic indices consistent with the rest of the giant training set. As a
result, the SDSS giants have no effect on our spectroscopic determination of
luminosity class; they are added only for their photometry.

SDSS, 2MASS, and WISE colors are available for much
of our giant and dwarf training sets; however, most lack



Index Name              Index Location [Å]    Continuum Region [Å]
--------------------    ------------------    ----------------------
Na I (a)                8172-8197             8170-8173, 8232-8235
Na I (b)                5868-5918             6345-6355
Ca II                   8484-8662             8250-8300, 8570-8600
Ba II/Fe I/Mn I/Ti I    6470-6530             6410-6420
K I                     7669-7705             7677-7691, 7802-7825
CaH 2                   6822-6838             7042-7046
CaH 3                   6950-6990             7042-7046
TiO 5 [a]               7126-7135             7042-7046

[a] Because TiO 5 has minimal gravity dependence, we measure
other spectroscopic indices with respect to the TiO 5 band.
Fig. 3. SNIFS spectra of an M dwarf (top) and M giant (bottom) of similar
T_eff (~ 3650 K) and magnitude (Kp ~ 14). Approximate regions for each of the
six indices we use for giant/dwarf discrimination are marked in light grey.
B 1 refers to a mix of atomic lines (Ba II, Fe I, Mn I, and Ti I) which
overlap at the SNIFS resolution (~ 1200). The TiO 5 molecular band (marked in
dark grey) is used as a probe of spectral type, although it is also sensitive
to metallicity (Lepine et al. 2007). Other atomic and molecular lines are
generally much weaker in late-type giant stars (Reid & Hawley 2005). Indeed,
the Na I (8172-8197 Å) and K I (7669-7705 Å) doublets are barely present in
the giant spectrum above, while they are both quite strong in the dwarf.

measurements in the D51 band, which covers the gravity-sensitive Mg Ib line at
5200 Å. Instead, we synthesize equivalent g - D51 colors from the spectra of
our training set. We obtain the zero point for the synthesized colors from
those stars in our sample which have both spectra and g and D51 magnitudes.

3.2. Spectroscopic Determination of Luminosity Class 

Our determination of luminosity class uses six different gravity-sensitive
molecular/atomic indices (Table 1 and Figure 3). Molecular and atomic indices
are ratios of the average flux level in a specified wavelength region to that
of a pseudo-continuum region. Indices are useful for M dwarfs, where the
continuum is poorly defined. The values of most indices are a function of both
the gravity and the temperature of the star. To remove this degeneracy we
compare measured indices to the TiO 5 spectral index. TiO 5, as defined by
Reid et al. (1995), is sensitive to spectral type and metallicity (Woolf &
Wallerstein 2006; Lepine et al. 2007), but it has minimal gravity dependence.
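
The index definition above (mean flux in a feature band divided by mean flux in a pseudo-continuum) can be sketched as below. The band limits are taken from Table 1; the function names are hypothetical, and averaging the region means when two continuum regions are listed is an assumption of this sketch rather than a documented choice.

```python
import numpy as np

def band_mean(wave, flux, lo, hi):
    """Mean flux between wavelengths lo and hi (inclusive), in the units of wave."""
    mask = (wave >= lo) & (wave <= hi)
    return flux[mask].mean()

def spectral_index(wave, flux, feature, continuum):
    """Ratio of the mean feature flux to the mean pseudo-continuum flux.

    feature   : (lo, hi) wavelength limits of the feature band
    continuum : list of (lo, hi) limits; region means are averaged (assumption)
    """
    f_feature = band_mean(wave, flux, *feature)
    f_cont = np.mean([band_mean(wave, flux, lo, hi) for lo, hi in continuum])
    return f_feature / f_cont

# Toy spectrum: flat continuum with a 50% deep band at the TiO 5 location.
wave = np.arange(7000.0, 7200.0, 1.0)            # wavelength grid [Angstrom]
flux = np.ones_like(wave)
flux[(wave >= 7126) & (wave <= 7135)] = 0.5
tio5 = spectral_index(wave, flux, (7126, 7135), [(7042, 7046)])
```

An index below 1 indicates absorption relative to the pseudo-continuum; a featureless spectrum returns exactly 1.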

We show spectra of giant and dwarf stars with similar effective temperatures
in Figure 3, with the location of each feature labeled. As can be seen, atomic
lines are generally weaker in giants than in dwarfs. Indeed, the Na I doublet
(8172-8197 Å) and K I (7669-7705 Å) lines are barely detectable in giants, but
easily identifiable in dwarf stars (Torres-Dodgen & Weaver 1993; Schiavon et
al. 1997; Gorlova et al. 2003). Molecular lines provide additional
luminosity-dependent spectral signatures. Metal hydride bands, such as the CaH
bands defined by Reid et al. (1995) and Lepine et al. (2007), have been used
for luminosity classification, although they become less useful for stars
earlier than K7. The calcium triplet (8484 - 8662 Å) is a useful indicator of
gravity (e.g. Cenarro et al. 2001; Kraus & Hillenbrand 2009), especially for M
stars, which emit comparatively more at red wavelengths. Giant and dwarf
training sets overlaid on Kepler target star indices are shown in Figure 4.

3.3. Photometric Determination of Luminosity Class 

We can use the available photometry to determine the luminosity class of a
much larger sample of Kepler stars lacking spectra. Brown et al. (2011)
primarily use g - D51 vs. g - r and J - K vs. g - i colors to separate Kepler
late-type giants from dwarfs. Both giants and the coolest dwarfs in the sample
have relatively weak Mg Ib lines, creating overlap between the dwarf and giant
training sets at red g - r. A similar effect happens with J - K. Near-infrared
photometry (J, H, K) has long been used to separate giants and dwarfs at
redder colors (Bessell & Brett 1988), in part due to strong CO and weak Na I
and Ca I absorption in giant stars. But for K and early M stars with
J - H < 0.7 and H - K < 0.2, the giant and dwarf sequences overlap, creating a
sizable region of ambiguity. At mid-infrared wavelengths, most giant stars
have warm dust emission, leading to significantly redder colors in the WISE
bandpasses. Other relations can be derived from an examination of our giant
and dwarf training sets. z - K vs. g - J follows a similar distribution to
that of J - K vs. g - i, but the giant and dwarf samples bifurcate at
g - J ~ 3.0, which makes it useful for isolating the reddest giants (see
Figure 5).

3.4. Application of training sets to the Kepler sample 

After each spectral index or color is measured or calculated for Kepler
targets and both training sets, we identify stars as giants or dwarfs
following the same technique as Gilbert et al. (2006). We begin by using the
spectral index or color measurements of the training stars to produce a
two-dimensional probability distribution function (PDF) for each index (or
color). The PDFs are constructed by treating the strength of each index or
color (henceforth S) as a Gaussian-distributed variable with respect to X. For
spectroscopic determination of luminosity class, X is defined to be the TiO 5
band and S is one of our gravity-sensitive indices (Na I, Ca II, Ba II/Fe I/Mn
I/Ti I, K I, or CaH). For photometric determination of luminosity class, X is
defined as g - J, g - i, J - H, g - r, 3.4 μm - 22 μm, or J - 3.4 μm and S is
z - K, J - K, H - K, g - D51, 4.6 μm - 12 μm, or K - 4.6 μm, respectively.
Values of S are binned according to their corresponding X value. Bins in X are
designed to contain an equal number of stars (20-25) in each bin, and as such
they are not equally spaced in X. The mean (S̄) and standard deviation (σ_S)
of the S distribution are computed in each bin. The two-dimensional PDF takes
the form:



$$ \mathrm{PDF}(X, S) = C \, P(X) \, \exp\left[ \frac{-\left(S - \bar{S}(X)\right)^2}{2 \left(\sigma_S(X)\right)^2} \right], \tag{2} $$



where C is a normalization such that the entire PDF integrates to 1, and P(X)
is the probability distribution function for X. P(X) excludes targets with
anomalous X values that would indicate an improperly included early-type star
that has erroneous photometry or significantly reddened colors. PDFs for both
giant and dwarf training sets overlaid on Kepler target star indices or colors
are shown in Figures 4 and 5 for the spectroscopic and photometric sets,
respectively.
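
The binning step and the Gaussian factor of Equation 2 can be sketched as follows. This is an illustration only: the function names are hypothetical, the normalization C and the prior P(X) are omitted, and the linear interpolation of the per-bin mean and standard deviation between bin centers is an assumption of this sketch.

```python
import numpy as np

def fit_binned_gaussian(x, s, stars_per_bin=20):
    """Sort by X, split into bins of (nearly) equal star count, and return the
    bin centers with the mean and standard deviation of S in each bin."""
    order = np.argsort(x)
    x, s = x[order], s[order]
    n_bins = max(1, len(x) // stars_per_bin)
    x_chunks = np.array_split(x, n_bins)
    s_chunks = np.array_split(s, n_bins)
    centers = np.array([c.mean() for c in x_chunks])
    means = np.array([c.mean() for c in s_chunks])
    sigmas = np.array([c.std(ddof=1) for c in s_chunks])
    return centers, means, sigmas

def gaussian_term(x0, s0, centers, means, sigmas):
    """exp[-(S - Sbar(X))^2 / (2 sigma_S(X)^2)] evaluated at (x0, s0)."""
    mean = np.interp(x0, centers, means)      # interpolation is an assumption
    sigma = np.interp(x0, centers, sigmas)
    return np.exp(-(s0 - mean) ** 2 / (2.0 * sigma ** 2))

# Synthetic training set: S rises linearly with X, with Gaussian scatter.
rng = np.random.default_rng(0)
x_train = rng.uniform(0.2, 0.9, 200)
s_train = 2.0 * x_train + rng.normal(0.0, 0.05, 200)
centers, means, sigmas = fit_binned_gaussian(x_train, s_train)
on_ridge = gaussian_term(0.5, np.interp(0.5, centers, means), centers, means, sigmas)
off_ridge = gaussian_term(0.5, 5.0, centers, means, sigmas)
```

Equal-count bins keep the per-bin statistics comparably well determined even where the training set is sparse in X, at the cost of unequal bin widths.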

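The likelihood that star i is a dwarf for a given index j is:

$$ L_{i,j} = \log\left( \frac{P_{\rm dwarf}}{P_{\rm giant}} \right), \tag{3} $$

and the likelihood given all indices is:

$$ \langle L_i \rangle = \frac{\sum_j w_j L_{i,j}}{\sum_j w_j}, \tag{4} $$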



where w_j is a weighting factor. Weights are calculated by determining the
efficiency of a given feature at separating giants from dwarfs as a function
of X. We take a random subsample (half the total sample) from each training
set, and add Poisson noise to the spectra/colors consistent with our
observations or given photometric errors. We then apply Equations 2-4 to the
subsamples using w_j = 1. Values of w_j are then set to be the fraction of
dwarfs that are correctly identified. To ensure that our determinations are
not overly sensitive to our weighting scheme, we recompute the likelihoods
using w_j = 1 for all j. In this case no stars with spectra are assigned a
different luminosity class, although the confidence of the assignments is
reduced; for the photometric determinations, however, the change is
significant. The reason for this is the significant overlap between the PDFs
of the color metrics for giant and dwarf training sets (e.g.
2.3 < g - J < 2.8 and 1.6 < z - K < 1.9; see also Figure 5). In overlapping
regions, indices or colors will give similar probabilities for a star being a
giant or a dwarf, making the metric less useful in giant/dwarf discrimination.
Weighting factors are set to 0 if any of the relevant indices/colors are
missing for a given star. We identify all Kepler target stars with spectra as
a giant or a dwarf with better than 99.7% (L_i > 2.6 or L_i < -2.6)
probability.
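
The combination of per-index likelihoods can be sketched as below, assuming the per-index likelihood is the logarithm of the ratio of dwarf to giant PDF values and the combined value is the weighted mean over indices. The function names are hypothetical, and the base-10 logarithm is an assumption; it is at least consistent with the quoted correspondence between |L_i| > 2.6 and better than 99.7% probability.

```python
import numpy as np

def combined_likelihood(p_dwarf, p_giant, weights):
    """Weighted mean of log10(P_dwarf / P_giant) over indices.

    Indices with zero weight (e.g. missing measurements) are skipped."""
    p_dwarf = np.asarray(p_dwarf, dtype=float)
    p_giant = np.asarray(p_giant, dtype=float)
    w = np.asarray(weights, dtype=float)
    valid = w > 0
    l_ij = np.log10(p_dwarf[valid] / p_giant[valid])
    return np.sum(w[valid] * l_ij) / np.sum(w[valid])

def dwarf_probability(l_i):
    """Probability of being a dwarf implied by a base-10 log likelihood ratio."""
    ratio = 10.0 ** l_i
    return ratio / (1.0 + ratio)

# Two indices with equal weight, then the same star with the second index missing.
l_both = combined_likelihood([0.9, 0.8], [0.1, 0.2], [1.0, 1.0])
l_one = combined_likelihood([0.9, np.nan], [0.1, np.nan], [1.0, 0.0])
p_threshold = dwarf_probability(2.6)
```

Setting the weight of a missing index to zero simply removes it from both sums, so the combined likelihood is then driven by whatever indices or colors remain.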

For the photometric sample, most stars are placed into unambiguous giant or
dwarf categories (⟨L_i⟩ > 1.5 for dwarfs or ⟨L_i⟩ < -1.5 for giants). However,
~ 2% of the sample are more ambiguous, most of which lack WISE photometry.
Colors for a subsample of Kepler stars are shown in Figure 5 with PDFs from
our giant and dwarf training sets overlaid.

3.5. Giant Star Fraction 

We find that, for the coolest Kepler stars (Kp - J > 2), giant stars dominate
the bright (Kp < 14) Kepler target stars but are relatively rare among dim
(Kp > 14) targets. The fraction of giants is 95 ± 1% for bright stars,




Fig. 4. — Measured strengths of each gravity-sensitive spectral feature vs. the strength of the TiO 5 band. The two-dimensional PDFs 
defined by our training set of giants (dashed line) and dwarfs (solid line) are overlaid. Contours of the PDF correspond to 68, 90, and 95% 
intervals for the given training set. We positively identify each star with spectra as a giant or a dwarf with > 99.7% certainty. 



7 ± 2% for dim stars, and 51 ± 3% for the combined set (based on our
spectroscopy). Photometric assignments (considering Kp - J > 2) give
consistent giant fractions: 95 ± 2% for bright stars, 8 ± 3% for dim stars,
and 59 ± 4% for all stars with Kp - J > 2. The fractions in each brightness
bin do not change significantly when we apply a KIC log g > 4.0 cut: the giant
fraction becomes 94 ± 4% for bright stars and 5 ± 3% for dim stars. However,
the fraction of giants for all stars decreases to 12 ± 3%, due mainly to the
large number of stars lacking any log g classification (which are removed by
this cut). Since giant/dwarf assignments based on spectroscopy are very
accurate, only counting statistics are considered for the spectroscopic
sample. For uncertainty estimates from the photometric sample, we re-apply our
likelihood calculations using 1000 different subsets of our training sets,
adding random (Poisson) noise to the photometry, and then recalculating the
giant fraction in each case. The variation in giant fraction is added in
quadrature with binomial errors. This does not consider systematic errors
(e.g. systematic photometric errors, discrepancies between training sets and
Kepler target stars, etc.).
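
The error budget for a giant fraction can be sketched as follows. This is a minimal illustration with hypothetical counts; the normal-approximation form of the binomial error is an assumption of this sketch, since the text does not specify which binomial estimator is used.

```python
import math

def binomial_error(n_giant, n_total):
    """Normal-approximation binomial uncertainty on the fraction n_giant / n_total."""
    f = n_giant / n_total
    return math.sqrt(f * (1.0 - f) / n_total)

def total_error(binom_err, resample_scatter):
    """Counting error and resampling scatter added in quadrature."""
    return math.hypot(binom_err, resample_scatter)

# Hypothetical example: 50 giants among 100 stars, plus a 4% scatter in the
# giant fraction across the noisy training-set resamplings.
counting = binomial_error(50, 100)
combined = total_error(counting, 0.04)
```

For the spectroscopic sample only the first term applies; for the photometric sample the second term captures how much the classification itself moves under photometric noise and training-set selection.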

4. PLANET FREQUENCY 
Fig. 5. Similar to Figure 4 except using gravity-sensitive color-color relations. Contours for the training set PDFs correspond to 68
and 90% levels. We apply this cut to Kepler target stars with J - H > 0.5 or Kp - J > 2.0, although only a subsample of this set is
shown for clarity. Most stars fall well inside either the dwarf or giant sequence; however, even when all color relations are used, ~ 2% of
the sample still have an ambiguous luminosity class assignment. Most of these stars are those lacking photometry in one or more bands.
This is a particular problem for WISE photometry, which only covers ~ 50% of the Kepler field.



4.1. Counting Estimation 

We first cal culate the frequency of planets following 
the method of iGaidos et al.l (|2012f) . The most probable 
mean frequency of the ith planet in the population of 
j = 1...N stars (/, 



is: 



fi 



N 

J'=l 



(5) 



where d_ij is the probability of detection if the planet is
transiting the jth star, and p_ij is the geometric
probability of a transit. For small planets and nearly circular
orbits,

$$ p = 0.238 \, P^{-2/3} M_*^{-1/3} R_*, \qquad (6) $$

where P is the orbital period in days and M_* and R_* are
the star's mass and radius in solar units.
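
As a hedged illustration of Equations 5 and 6, the sketch below evaluates the geometric transit probability and the resulting frequency for one planet over a set of stars. It assumes the M_* exponent in Equation 6 is -1/3, as follows from Kepler's third law, and that Equation 5 is the inverse of the summed products d_ij p_ij. The function names and the toy inputs are hypothetical.

```python
import numpy as np

def transit_probability(period_days, mass_msun, radius_rsun):
    """Equation 6: geometric transit probability for a small planet on a
    nearly circular orbit (period in days, mass and radius in solar units)."""
    return (0.238 * period_days ** (-2.0 / 3.0)
            * mass_msun ** (-1.0 / 3.0) * radius_rsun)

def planet_frequency(detect_prob, period_days, mass_msun, radius_rsun):
    """Equation 5: f_i = 1 / sum_j (d_ij * p_ij), summed over the N stars."""
    d = np.asarray(detect_prob, dtype=float)
    p = transit_probability(period_days,
                            np.asarray(mass_msun, dtype=float),
                            np.asarray(radius_rsun, dtype=float))
    return 1.0 / np.sum(d * p)

# Toy usage: one planet with P = 8 d, evaluated against three M dwarfs.
toy_f = planet_frequency([1.0, 1.0, 0.0], 8.0, [0.5, 0.6, 0.4], [0.5, 0.6, 0.4])
```

Stars around which the planet could not have been detected (d_ij = 0) drop out of the sum, so a planet detectable around few stars carries a large weight.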

Values for M_* and R_* are computed by interpolating a
grid of stellar radii/masses from the Dartmouth Stellar
Evolution Database (DSEP; Dotter et al. 2008) at
estimated values of T_eff, [Fe/H], and age. We use DSEP
because radii and masses derived from their isochrones
are in good agreement (< 0.02 RMS deviation in radius)
with current empirical measurements of eclipsing binary
systems (Dotter et al. 2008; Feiden et al. 2011).

For exoplanet hosts we use the metallicities given
in Muirhead et al. (2012), but for field stars
metallicities are drawn from a random Gaussian distribution
of metallicities with [Fe/H] = -0.7 and σ_[Fe/H] =
0.20. This distribution is designed to be consistent with
the distribution of M dwarfs in the solar neighborhood
(Johnson & Apps 2009; Casagrande et al. 2011). Ages
are assigned randomly assuming a constant star
formation rate (excluding ages < 100 Myr). However, since
M dwarfs do not change significantly while on the main
sequence, our results are not changed when we fix all
ages to 5 Gyr. The resulting stellar radii from the DSEP
grid are used in conjunction with values of R_p/R_* from
Borucki et al. (2011a) to compute planetary radii.
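
The parameter assignment for field stars can be sketched as below. This is an illustration only: the function names are hypothetical, the mean and width of the metallicity distribution are passed in rather than hard-coded, and the upper age limit of 10 Gyr is an assumption, since the text specifies only the 100 Myr lower bound. The interpolation in the DSEP grid itself is not reproduced here.

```python
import numpy as np

R_SUN_IN_R_EARTH = 109.2  # approximate solar radius in Earth radii

def draw_field_star_parameters(n_stars, feh_mean, feh_sigma,
                               min_age_gyr=0.1, max_age_gyr=10.0, rng=None):
    """Draw [Fe/H] from a Gaussian and ages from a uniform distribution
    (constant star formation rate). max_age_gyr is an assumed value."""
    rng = np.random.default_rng() if rng is None else rng
    feh = rng.normal(feh_mean, feh_sigma, size=n_stars)
    age_gyr = rng.uniform(min_age_gyr, max_age_gyr, size=n_stars)
    return feh, age_gyr

def planet_radius_rearth(radius_ratio, stellar_radius_rsun):
    """Planet radius in Earth radii from R_p/R_* and R_* in solar radii."""
    return radius_ratio * stellar_radius_rsun * R_SUN_IN_R_EARTH
```

For example, a transit with R_p/R_* = 0.04 around a star of 0.5 solar radii corresponds to `planet_radius_rearth(0.04, 0.5)`, about 2.2 Earth radii.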

Estimates of T_eff can be inferred from our optical
spectra. We compare our visible spectra to a grid of models
of K- and M-dwarf spectra generated by the BT-SETTL
version of PHOENIX (Allard et al. 2010). Details of
the comparison, sub-grid interpolation, and error
calculations are described in Lepine et al. (in prep). The grid
of models spans T_eff of 3000-5000 K in steps of 100 K,
log g values of 0.0-5.0 in steps of 0.5 dex, and
metallicities of [M/H] = -1.5, -1, -0.5, 0, +0.3, and +0.5. [α/Fe]
is taken to be solar. We report the T_eff of the best-fit
interpolated model, and the standard deviation of T_eff
among the set of interpolated models that are nearby in
parameter space.

Our calculated values of T_eff are shown in Fig. 6 vs. the
temperature given in the KIC (Brown et al. 2011). Our
temperatures are systematically lower than KIC
temperatures by 135^30 K for the dwarf stars, and 226^29 K
for the giant stars. Errors are calculated by bootstrap
resampling. This offset is consistent with other determinations
using the atmospheric models of Allard et al. (2010),
including other determinations on Kepler KOI stars
(Muirhead et al. 2012). Our calculated temperatures are
tightly correlated with KIC temperatures. When KIC
temperatures are corrected for our observed offset, the
standard deviation of the difference in calculated
temperatures (σ_KIC-PHOENIX) is 90 K, suggesting that the KIC
temperatures for low-mass stars are more precise but
less accurate than suggested by Brown et al. (2011). For
field stars with visible-wavelength spectra, we adopt our
calculated T_eff values, and for stars with exoplanet
candidates we use the T_eff from Muirhead et al. (2012). For
the remaining stars we adjust the KIC effective
temperatures downward randomly by 135^30 K to
keep the temperatures consistent with those of the KOI
stars and those with spectra in our sample. This offset is
randomized to account for errors in the systematic
difference between temperatures calculated from our spectra
and those listed in the KIC.

Fig. 6. - Effective temperatures computed by fitting our spectra
to models from the BT-SETTL version of PHOENIX (Allard et al.
2010) as a function of the KIC assigned effective temperature for
giants (top) and dwarfs (bottom). The dotted line indicates equality.
Errors are estimated as part of our model fitting procedure (errors
on KIC temperatures are taken to be 135 K; Brown et al. 2011).
For both giants and dwarfs there is a clear 100-200 K offset between
our spectroscopically determined temperatures and the KIC
temperatures. This is most likely a consequence of the models used,
as the Castelli & Kurucz (2004) models used to fit KIC photometry to
effective temperatures are unreliable below 4000 K.

We consider a planet detected (d_ij = 1) if:

$$ \mathrm{S/N} = \frac{\delta \sqrt{N}}{\sigma_{\rm CDPP}} \sqrt{\frac{t}{30}} > 10, \qquad (7) $$

where δ is the transit depth, N is the number of
transits detected over the observational period, t is the
transit duration in minutes, and σ_CDPP is the 30 minute
combined differential photometric precision (CDPP) of
Kepler. We use Quarter 0-2 30 minute CDPP values
from Kepler. Our detection threshold S/N = 10 matches
what is used by Howard et al. (2011), but is higher than
what is used by Borucki et al. (2011a); we adopt the higher
value to cut down on the number of false positives and/or
highly inaccurate transit parameters.
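
The detection criterion can be written as a short function. This sketch assumes Equation 7 has the form S/N = (δ √N / σ_CDPP) √(t/30), with the transit depth and the CDPP expressed in the same fractional units. The function names are hypothetical.

```python
import math

def transit_snr(depth, n_transits, duration_min, cdpp_30min):
    """Assumed form of Equation 7: depth * sqrt(N) / sigma_CDPP * sqrt(t / 30).
    depth and cdpp_30min must share units (e.g. fractional flux)."""
    return (depth * math.sqrt(n_transits) / cdpp_30min
            * math.sqrt(duration_min / 30.0))

def is_detected(depth, n_transits, duration_min, cdpp_30min, threshold=10.0):
    """Return True if the planet counts as detected (d_ij = 1)."""
    return transit_snr(depth, n_transits, duration_min, cdpp_30min) > threshold
```

For instance, a 0.1% deep transit lasting 120 minutes, observed 4 times on a star with a CDPP of 100 ppm, gives `transit_snr(1e-3, 4, 120.0, 1e-4)`, about 40, well above the threshold.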

We compute the frequency of planets with 2 R_⊕ <
R_p < 32 R_⊕ and P < 50 days around stars with 3400 <
T_eff < 4100 using Equations 5-7. We calculate the
standard deviation of the frequency using a Monte Carlo
analysis. Stellar parameters are perturbed randomly (see
above), accounting for errors from Muirhead et al. (2012)
on KOI metallicity and T_eff, and random errors
derived from the T_eff fits to our spectra (see Figure 6). Other
stars are given a random error of 90 K. We perturb the
transit parameters R_p/R_* and period according to errors
given by Borucki et al. (2011b). Planetary radii are
recalculated from the perturbed values of R_p/R_* and R_*.

We remove planets from the KOI sample using the false
positive probabilities from Morton & Johnson (2011) (a
planet candidate with a 5% false positive probability is
removed in 5% of the simulations). We remove giant stars
from the sample using the calculated photometric
likelihoods (Section 3.3) for each star, such that a star with a
10% likelihood of being a giant star will be removed from
the sample in 10% of the Monte Carlo (MC) simulations.
This also applies to stars with detected planet
candidates, causing the planet to be removed, i.e. we consider
the planet detection to be a false positive if the star is a
giant. We run an additional set of MC simulations
following the same criteria, but using the KIC log g > 4.0
criterion, intended as a comparison with Borucki et al.
(2011b).

We find that there are 0.36 ± 0.07 planets (with 2 R_⊕ <
R_p < 32 R_⊕ and P < 50 days) per star in the
temperature range 3400 < T_eff < 4100. This value is slightly
lower (by 0.09 planets per star) if giant stars are not
properly removed. For comparison we run an additional
Monte Carlo simulation in which giants are excluded only
by requiring KIC log g > 4.0, as in Howard et al. (2011). To test
how our results depend on our choice of stellar radii
model (DSEP) we also run two simulations using the
Yale-Yonsei (Han et al. 2009) isochrones: one with giant
stars removed as explained above and another retaining
only stars with KIC log g > 4.0. The resulting Monte
Carlo distributions are shown in Figure 7.
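
The probabilistic removal of false positives and giants can be sketched as follows. This is a simplified illustration, not the authors' pipeline. It keeps the stellar and transit parameters fixed instead of perturbing them. It also assumes the total number of planets per star is the sum of the per-planet frequencies of Equation 5. All names are hypothetical: `dp[i, j]` holds the product d_ij p_ij for planet i and star j, and `host_index[i]` is the column of planet i's host star.

```python
import numpy as np

def monte_carlo_frequency(dp, host_index, false_pos_prob, giant_prob,
                          n_trials=1000, rng=None):
    """Return the planets-per-star total from each Monte Carlo trial."""
    rng = np.random.default_rng() if rng is None else rng
    dp = np.asarray(dp, dtype=float)
    host_index = np.asarray(host_index, dtype=int)
    false_pos_prob = np.asarray(false_pos_prob, dtype=float)
    giant_prob = np.asarray(giant_prob, dtype=float)
    totals = np.empty(n_trials)
    for trial in range(n_trials):
        # A star with giant likelihood g is dropped in a fraction g of trials.
        keep_star = rng.random(giant_prob.size) >= giant_prob
        # A planet survives if it is not a false positive and its host is kept.
        keep_planet = ((rng.random(false_pos_prob.size) >= false_pos_prob)
                       & keep_star[host_index])
        # Equation 5 restricted to the stars kept in this trial.
        denom = dp[:, keep_star].sum(axis=1)
        totals[trial] = np.sum(1.0 / denom[keep_planet])
    return totals
```

The mean and standard deviation of the returned array then play the role of the quoted frequency and its uncertainty.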

4.2. Likelihood estimation 

We also perform a maximum likelihood estimation of
the fraction of stars with planets with radii 2 R_⊕ < R <
32 R_⊕ and orbital period P < 50 d (see Howard et al.
2011 for a similar analysis). For discrete, binomial
(detection or non-detection) events, the likelihood is
expressed as:

$$ L = \prod_{j}^{D} p_j \times \prod_{k}^{ND} (1 - p_k), \qquad (8) $$

where the first product is over detections, the second is over
non-detections, and p_i is the probability that a planet
with properties in the appropriate ranges will be found
around the ith star. For this formulation, we have
assumed that p < 1. We adopt the specific power-law
form dN = C R^{-α} P^{-β} d ln R d ln P for the intrinsic
distribution of planets. If both α and β are > 0 then the
normalization factor C is given by:

$$ C = \frac{f \alpha \beta}{(R_1^{-\alpha} - R_2^{-\alpha}) (P_1^{-\beta} - P_2^{-\beta})}, \qquad (9) $$



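where f is the total fraction of stars with such planets,
and (R_1, R_2) and (P_1, P_2) are the limits of the radius
and period ranges. We do not model multi-planet systems;
that level of analysis is not justified given the large
uncertainties in our parameters.

Following the usual procedure, we maximize the
logarithm of L:

$$ \ln L = \sum_{j}^{D} \left[ \ln C - \alpha \ln R_j - \beta \ln P_j + \ln D_j(R_j, P_j) \right] + \sum_{k}^{ND} \ln \left[ 1 - C F_k(\alpha, \beta) \right], \qquad (10) $$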



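where D_j(R_j, P_j) is the probability of detecting the jth
planet around its host star (note D_j(R_j, P_j) = d_j p_j, see
Equations 5 and 6), including the geometric factor, and

$$ F_k(\alpha, \beta) = \int_{R_1}^{R_2} \int_{P_1}^{P_2} R^{-\alpha} P^{-\beta} D_k(R, P) \, d\ln R \, d\ln P. \qquad (11) $$

If the detection rate is low, then:

$$ \ln L \approx \sum_{j}^{D} \left[ \ln C - \alpha \ln R_j - \beta \ln P_j + \ln D_j(R_j, P_j) \right] - C \sum_{k}^{ND} F_k(\alpha, \beta). \qquad (12) $$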



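We then substitute Equation 9 for C. Ignoring terms
that do not depend on α, and thus do not affect its
maximum likelihood value, we find the following quantity
must be maximized:

$$ \ln L_\alpha = \sum_{j}^{D} \left[ \ln \alpha - \ln (R_1^{-\alpha} - R_2^{-\alpha}) - \alpha \ln R_j \right] - \frac{f \alpha \beta \sum_{k}^{ND} F_k(\alpha, \beta)}{(R_1^{-\alpha} - R_2^{-\alpha}) (P_1^{-\beta} - P_2^{-\beta})}. \qquad (13) $$

Likewise,

$$ \ln L_\beta = \sum_{j}^{D} \left[ \ln \beta - \ln (P_1^{-\beta} - P_2^{-\beta}) - \beta \ln P_j \right] - \frac{f \alpha \beta \sum_{k}^{ND} F_k(\alpha, \beta)}{(R_1^{-\alpha} - R_2^{-\alpha}) (P_1^{-\beta} - P_2^{-\beta})}. \qquad (14) $$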



The simultaneous solution for the fraction of stars with
planets is found by maximizing the terms that depend
on f and is simply

$$ f = \frac{N_p (R_1^{-\alpha} - R_2^{-\alpha}) (P_1^{-\beta} - P_2^{-\beta})}{\alpha \beta \sum_{k}^{ND} F_k(\alpha, \beta)}, \qquad (15) $$



where N_p is the number of detected planets. Equation
15 immediately suggests a reduction in the last terms of
Equations 13 and 14 to N_p, which is independent of α
and β and can be ignored. Tests with artificial Monte
Carlo data sets suggest that the recovered values of α
are biased downwards, but that f is robustly recovered.

Because there are too few systems in our sample for
adequate treatment, we fix β = 0 with a cut-off at
P_1 = 1 d, consistent with the findings of previous
analyses (Cumming et al. 2008; Howard et al. 2011). In the
limit β → 0, (P_1^{-β} - P_2^{-β})/β → ln(P_2/P_1), and
Equation 15 becomes:

$$ f = \frac{N_p (R_1^{-\alpha} - R_2^{-\alpha}) \ln (P_2 / P_1)}{\alpha \sum_{k}^{ND} F_k(\alpha, \beta = 0)}. \qquad (16) $$
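
A numerical sketch of the β = 0 estimator may help make the final expression concrete. It assumes the integral F_k is that of Equation 11, evaluated here with a simple trapezoid rule on a uniform grid in ln R and ln P. The detection function `detect_fn(R, P)` is a hypothetical stand-in for D_k(R, P), and the function names are illustrative only.

```python
import numpy as np

def trapezoid_weights(x):
    """Trapezoid-rule weights for a uniformly spaced grid x."""
    w = np.full(x.size, x[1] - x[0])
    w[0] *= 0.5
    w[-1] *= 0.5
    return w

def f_integral(alpha, detect_fn, r1, r2, p1, p2, n_grid=201):
    """F_k(alpha, beta=0): integral of R^-alpha * D_k(R, P) over ln R, ln P."""
    ln_r = np.linspace(np.log(r1), np.log(r2), n_grid)
    ln_p = np.linspace(np.log(p1), np.log(p2), n_grid)
    radius, period = np.meshgrid(np.exp(ln_r), np.exp(ln_p), indexing="ij")
    integrand = radius ** (-alpha) * detect_fn(radius, period)
    return trapezoid_weights(ln_r) @ integrand @ trapezoid_weights(ln_p)

def planet_fraction(n_planets, alpha, f_sum, r1, r2, p1, p2):
    """Fraction of stars with planets for beta = 0, given sum_k F_k."""
    return (n_planets * (r1 ** -alpha - r2 ** -alpha) * np.log(p2 / p1)
            / (alpha * f_sum))
```

As a sanity check, if every star had the same constant detection probability D_0, the estimator would reduce to f = N_p / (N_stars D_0), independent of α.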



Using the cool KOIs defined here, stellar parameters
derived as explained above, and Monte Carlo data sets
generated by sampling with replacement, we find that
f = 0.34 ± 0.08, consistent with our previous calculation.
As before, we repeat our Monte Carlo simulation with
giants excluded only by requiring KIC log g > 4, and
perform another run using the Yale-Yonsei evolutionary tracks
(Han et al. 2009) instead of those of DSEP. The
resulting Monte Carlo distributions are shown in Figure 7.

Fig. 7. - Number of planets per star with giant stars removed
(solid line) or using KIC log g > 4.0, using isochrones from DSEP
(black) or from Yale-Yonsei (red) calculated by Monte Carlo
analysis. The bottom plot is calculated using likelihood estimation as
explained in Section 4.2. For both plots, we consider planets with
radii 2 R_⊕ < R_p < 32 R_⊕ and periods P < 50 days, and stars with
effective temperatures 3400 < T_eff < 4100. A full description of
our Monte Carlo analysis is given in Section 4.

5. PLANET-HOST METALLICITIES 

Schlaufman & Laughlin (2011) use g - r vs. J - H
colors to conclude that late-type (J - H ~ 0.62) exoplanet
hosts are redder and more metal-rich than stars
without a transiting planet. Because giant stars have bluer
g - r colors at a given J - H color (Bessell & Brett 1988;
Gilbert et al. 2006; Schlaufman & Laughlin 2011), a
significant number of giant star interlopers in their sample
will cause field stars to appear metal poor. Giant stars
have stellar radii 10-100 times larger than dwarfs, making
it much less likely that they will appear as KOIs (with
the exception of false positives).

We can test their findings by creating a "pure" dwarf
sample, and comparing its color distribution to that of
the KOI sample. Our Kp - J > 2 sample is
systematically redder in J - H than the 0.56 < J - H < 0.66
bin used in Schlaufman & Laughlin (2011), preventing
us from making a direct comparison. Instead, we
construct samples of giants and dwarfs in the J - H ~ 0.62
bin based on our photometric determination of
luminosity class. For both the dwarf and giant samples, we
select Kepler target stars with photometry in all bands
used in our photometric assignment of luminosity class
(J, H, K, D51, g, r, and all four WISE bands). We then
select stars with a > 90% likelihood of being dwarfs based
on our analysis in Section 3.3. The resulting dwarf
sample is ~ 2500 stars. This sample may still contain
giants. To place limits on the giant contamination rate,
we added Poisson noise to the photometry of both the
training sets and the Kepler sample, then took random
subsamples of both training sets and reapplied them to
the modified photometry of the Kepler sample. We
repeated this process 1000 times. By analyzing the number
of giant stars in each of these new samples we find that
our dwarf sample is < 1% giant stars at 95% confidence,
although this analysis ignores possible systematic errors
in our technique.

We use this dwarf sample, following the method of
Schlaufman & Laughlin (2011), to compare the g - r
colors at a given J - H (a proxy of effective temperature) of
the exoplanet host stars with those of our dwarf sample. We find
no significant difference in color between the KOI stars
and our dwarf sample. Figure 8 shows g - r colors as
a function of J - H colors for the dwarf, giant,
planet-host, and KIC log g > 4.0 samples. Unlike the KIC log g
sample, the locus of our photometrically selected dwarf
sample is a good match for the KOI sample locus at
J - H ~ 0.62. For stars with Kp - J > 2.0 we find an
offset in g - r color of -0.03 ± 0.03 between the
spectroscopically confirmed dwarfs and late-type KOI stars
hosting Earth-to-Neptune sized planets. When we use
our photometric sample of dwarfs in the J - H ~ 0.62
bin we find an offset of 0.01 ± 0.02 and we can rule out
the offset of 0.08 seen by Schlaufman & Laughlin (2011)
with > 99.7% certainty. Our photometric selection may
remove some metal-poor dwarfs. However, even when we
include stars with > 60% likelihood of being dwarfs, which will
necessarily increase the number of interloping giants, the
offset is still only 0.02 ± 0.02 (consistent with zero offset).
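
A color offset of this kind, with a bootstrap uncertainty, can be estimated along the following lines. This is a sketch under assumptions rather than the procedure actually used: it takes the offset to be the difference of median g - r colors (hosts minus control) within a single J - H bin, and resamples both samples with replacement. The function name and the sign convention are hypothetical.

```python
import numpy as np

def median_color_offset(host_gr, control_gr, n_boot=1000, rng=None):
    """Median g - r offset (hosts minus control) and its bootstrap error."""
    rng = np.random.default_rng() if rng is None else rng
    host_gr = np.asarray(host_gr, dtype=float)
    control_gr = np.asarray(control_gr, dtype=float)
    offset = np.median(host_gr) - np.median(control_gr)
    boots = np.empty(n_boot)
    for i in range(n_boot):
        # Resample each population with replacement and recompute the offset.
        h = rng.choice(host_gr, size=host_gr.size, replace=True)
        c = rng.choice(control_gr, size=control_gr.size, replace=True)
        boots[i] = np.median(h) - np.median(c)
    return offset, boots.std(ddof=1)
```

Contaminating the control sample with bluer giants would shift its median g - r downward and produce a spurious positive offset, which is the effect discussed in this section.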

In spite of the low giant fraction for dim Kepler
target stars, it is not sufficient to simply repeat the
Schlaufman & Laughlin (2011) analysis exclusively for
stars with Kp > 14. Since Schlaufman & Laughlin
(2011) only examine stars with KIC log g > 4.0, it
is far more important to investigate the distribution of
misidentified giants in this color range (i.e. giant stars
which were given log g > 4 in the KIC). In fact the
fraction of misidentified dim giant stars in their J - H ~ 0.62
bin is higher (12%) than it is for the Kp - J > 2 star
sample. We show why this is the case in Figure 9, which
shows the distribution of giants, dwarfs, and
misidentified giants in J - H vs. g - r space. Misidentified
giants are clustered at 0.58 < J - H < 0.63. Further,
the misidentified giants in this J - H range are much
bluer than the dwarfs in the same range. Thus
by selecting a color bin centered on J - H = 0.62,
Schlaufman & Laughlin (2011) are over-selecting giant
stars, even after applying a KIC log g > 4 cut (~ 15% of
this sample are giant stars). This overdensity of
misidentified giants is the most likely explanation for the color
offset seen by Schlaufman & Laughlin (2011), and also
explains why the same g - r offset is not seen at redder
J - H colors.
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Fig. 8. — Median g — r colors as a function of J — H colors 
for Kepler target stars with: Earth to Neptune sized planet can- 
didates (dotted/dashed line, diamonds), KIC log g > 4.0 (solid 
line, asterisks), > 90% likelihood of being dwarfs based on their 
colors (dotted line, triangles), > 90% likelihood of being giants 
(dashed line, circles). The 1σ errors are calculated for the
median in each bin by bootstrap resampling. Bins for all data sets
are the same, but each point is offset slightly from the bin center
for clarity. There is a statistically significant offset between the
KIC log g > 4.0 sample and the planet hosts when we consider
stars with 0.58 < J - H < 0.66; however, this offset is no longer
present when misidentified giant stars are removed from the
sample. Indeed, our dwarf control sample closely tracks the colors of
the planet-hosting stellar population.

6. DISCUSSION 

We use visible-wavelength spectra to determine the 
properties of a subset of late-type Kepler target stars. 
We separate giants from dwarfs by comparing our spec- 
tra to those of stars with known luminosity class, and 
determine effective temperatures by comparing with 
PHOENIX model spectra. We extend our results to a 
larger collection of Kepler stars using photometry from 
the KIC, 2MASS, and WISE catalogs. We apply our
luminosity class determinations to refine estimates of the
frequency of planets around stars with 3400 < T_eff <
4100, and compare the colors (and hence metallicities)
of stars with and without detected Earth- to Neptune-sized
planets. We draw four major conclusions:

1. Among stars redder than Kp — J = 2 (~ K5 
and later), bright (Kp < 14) stars are predominantly 
(95 ±1%) giants, while dim stars (Kp > 14) are predom- 
inantly (93 ± 2%) dwarfs. These fractions do not change 
significantly (94 ± 4% and 96 ± 2% respectively) when we 
consider stars with KIC determined log g > 4.0. Overall 
49 ± 3% of Kepler stars with Kp — J > 2 are giants. 
However, only 12 ± 3% of said stars with KIC log g > 4.0 
are giants. 

2. KIC effective temperatures, based on the models of
Castelli & Kurucz (2004) and g, r, i, z, J, H, K
photometry, are systematically higher by 135^40 K compared
to those derived from our own spectra and PHOENIX
BT-SETTL atmosphere models (Allard et al. 2010).

3. Adopting the temperature scale from BT-SETTL
and radii/masses from the Dartmouth Stellar Evolution
Database (Dotter et al. 2008), and our giant star
identification, we find that there are 0.36 ± 0.07 planets with
radii 2 R_⊕ < R_p < 32 R_⊕ and periods P < 50 days
per star in the temperature range 3400 < T_eff < 4100.
Using the KIC determined luminosity classes leads to a
somewhat lower planet frequency of 0.27 ± 0.05 planets
per star. When we use the maximum likelihood method
to estimate the planet frequency, we get 0.34 ± 0.08
planets per star.

4. The g - r colors of exoplanet host stars at J - H ~
0.62 are consistent with an unbiased sample of Kepler
dwarf stars, ruling out any large difference between hosts
of Earth-to-Neptune sized planets and those without any
detected planets.

Surprisingly, there are hundreds of stars in our
photometric sample that could have been easily identified as
giants with photometry included in the KIC, but were
assigned log g > 4. The KIC primarily uses g - D51
vs g - r colors to identify giants, and many late-type
stars with KIC log g > 4.0 have g - D51 vs g - r colors
consistent with giants (and inconsistent with dwarfs).

Our calculated giant fraction is consistent with other
independent measurements. Gaidos et al. (2012)
compare radial velocity data from M2K (Apps et al. 2010;
Fischer et al. 2011) to Kepler results and note that the
completeness of the coolest Kepler target stars may be
quite low (~ 50%), much of which could be explained by
an underestimate of the frequency of giant stars.
Additionally, Ciardi et al. (2011) find that bright Kepler M
stars are "predominantly giants, regardless of the KIC
classification" based on JHK photometry alone.

Interestingly, we find two KOIs with colors consistent
with giant stars. KOI 667 and KOI 977 both fall within
our giant training set in multiple color relations, and well
outside our dwarf training set. KOI 977 was identified
as a giant by Muirhead et al. (2011), and they also note
that KOI 667 consists of 5 objects within 6", which may
be contaminating 2MASS or WISE photometry. One of
these objects could be an eclipsing binary, diluted by the
other stars. KOI 667 also has a relatively high (10%) false
positive probability based on Galactic structure models.

Our values of T_eff are consistent with results
reported elsewhere also using BT-SETTL, including
observations of the late-type KOIs with near-infrared spectra
(Muirhead et al. 2011). These authors find a similar
systematic offset of 123^32 K between their temperatures
and KIC assigned temperatures. KIC temperatures are
based on the models of Castelli & Kurucz (2004) and the
evolutionary tracks of Girardi et al. (2000), which,
although reliable for solar-mass stars, are untrustworthy
for stars with T_eff < 3750 K (Brown et al. 2011).

Our planet frequency estimate is slightly higher than
that of Howard et al. (2011), who, using results from
Kepler, find that there are 0.30 ± 0.08 planets per star
with 3600 < T_eff < 4100. The difference is
primarily due to their reliance on luminosity class determinations
by Brown et al. (2011), which we find to be inaccurate.
However, the difference is within 1σ. For both our
work and that of Howard et al. (2011), errors are
dominated by the low number of late-type stars (and therefore
planets around them) in the Kepler field and very high
random (~ 35%) errors in stellar radii.

In addition to random errors (e.g. stellar radii and
R_p/R_*) that are included in our Monte Carlo simulation,
there may be large systematic uncertainties in
atmosphere models and evolutionary tracks, which can change
the resulting frequency. When we use the Yale-Yonsei
isochrones, our planet frequency decreases by ~ 0.09
planets per star. Interestingly, this difference is similar
in size to the random errors in our Monte Carlo
analysis (~ 0.08 planets per star), and the difference between
proper giant removal and using KIC log g > 4.0 (~ 0.10
planets per star). This suggests that giant star removal,
improved stellar characterization of the dwarf stars, and
use of reliable stellar models of late-type stars are all
important for characterizing the frequency of planets
around very cool stars.

Fig. 9. - Distribution of giants (green), dwarfs (red), and giants labeled as dwarfs by the KIC (blue) in g - r vs J - H space. The
histograms on the bottom and right side show the 1-D distribution in each color (coloring matches the center plot). Histograms and
contours are normalized to 1. The contours mark regions where the density is 3 times the median density over the J - H and g - r range.
Although giant stars cover a range of J - H colors, those that were mislabeled as dwarfs are more concentrated around J - H ~ 0.61,
manifesting as a jump in the density of the blue sample.

The lack of a strong correlation between host-star
metallicity and the presence of Earth-to-Neptune sized
planets is consistent with what is found for solar-type
stars (e.g., Mayor et al. 2011). This also matches the
findings of Muirhead et al. (2011), who determine that
among the late-type Kepler exoplanet hosts in our
sample the median [M/H] is -0.11 ± 0.02. This distribution
is consistent with, or more metal poor than, stars in the
solar neighborhood (-0.05 to -0.15; Johnson & Apps
2009; Schlaufman & Laughlin 2010; Casagrande et al.
2011). A metallicity difference could only be present
if Kepler target stars are significantly more metal poor
than stars in the solar neighborhood. As explained in
Gaidos et al. (2012), this is unlikely given the position
and distance to late-type Kepler dwarfs.

Our analysis of the g - r colors of planet hosts
contradicts the results of Schlaufman & Laughlin (2011), who
find a 4σ difference between g - r colors of late-type
exoplanet hosts and stars with no exoplanets present.
Their result is most likely an artifact of the large
number of stars which were misclassified as dwarfs in the
KIC. Schlaufman & Laughlin (2011) state that their
result can be reproduced if their sample of KIC log g > 4
stars is between 10% and 30% giants, which they
calculate by adding stars with KIC log g < 4 into their
control sample, and measuring the resulting g - r color
offset. We find that the giant fraction is above 10% for
this color range. Further, if the KIC log g > 4 sample
is significantly contaminated with giants, the sample will
have bluer colors than a true dwarf sample. Adding
additional giants to this contaminated sample will create
smaller changes in the overall color of the sample than
if it had contained only dwarf stars. Thus many more
giants will be required to produce a given offset, creating
an artificially high estimate for the level of giant
contamination required to produce the offset.

Although the g - r colors of exoplanet hosts in our
sample are consistent with our dwarf sample, we
cannot rule out small offsets (< 0.05) in g - r color. It is
possible that any metallicity effect is sufficiently small
that it is diluted to non-detection by the large number
of undetected exoplanets in the dwarf sample. As
Kepler continues to discover planets of smaller radii and
at larger orbital periods, the answer may become
clearer.
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